Identifying minimally acceptable interpretive performance criteria for screening mammography.
Abstract
PURPOSE To develop criteria to identify thresholds for minimally acceptable physician performance in interpreting screening mammography studies and to profile the impact that implementing these criteria may have on the practice of radiology in the United States.
MATERIALS AND METHODS In an institutional review board-approved, HIPAA-compliant study, an Angoff approach was used in two phases to set criteria for identifying minimally acceptable interpretive performance at screening mammography as measured by sensitivity, specificity, recall rate, positive predictive value (PPV) of recall (PPV(1)) and of biopsy recommendation (PPV(2)), and cancer detection rate. Performance measures were considered separately. In phase I, a group of 10 expert radiologists considered a hypothetical pool of 100 interpreting physicians and conveyed their cut points of minimally acceptable performance. The experts were informed that a physician's performance falling outside the cut points would result in a recommendation to consider additional training. During each round of scoring, all expert radiologists' cut points were summarized into a mean, median, mode, and range; these were presented back to the group. In phase II, normative data on performance were shown to illustrate the potential impact cut points would have on radiology practice. Rescoring was done until consensus among experts was achieved. Simulation methods were used to estimate the potential impact of performance that improved to acceptable levels if effective additional training was provided.
RESULTS Final cut points to identify low performance were as follows: sensitivity less than 75%, specificity less than 88% or greater than 95%, recall rate less than 5% or greater than 12%, PPV(1) less than 3% or greater than 8%, PPV(2) less than 20% or greater than 40%, and cancer detection rate less than 2.5 per 1000 interpretations. The selected cut points for performance measures would likely result in 18%-28% of interpreting physicians being considered for additional training on the basis of sensitivity and cancer detection rate, while the cut points for specificity, recall, and PPV(1) and PPV(2) would likely affect 34%-49% of practicing interpreters. If underperforming physicians moved into the acceptable range, detection of an additional 14 cancers per 100,000 women screened and a reduction in the number of false-positive examinations by 880 per 100,000 women screened would be expected.
CONCLUSION This study identified minimally acceptable performance levels for interpreters of screening mammography studies. Interpreting physicians whose performance falls outside the identified cut points should be reviewed in the context of their specific practice settings and be considered for additional training.
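As an illustration of how the final cut points could be applied, the following minimal Python sketch checks one interpreting physician's measures against the thresholds reported in the abstract. The dictionary keys, the function name `flag_measures`, and the example values are hypothetical and are not taken from the study; each measure is checked separately, as in the study design, and values exactly on a cut point are treated as acceptable because the abstract states the criteria as "less than" and "greater than".

```python
from typing import Dict, List, Optional, Tuple

# Cut points from the abstract as (lower bound, upper bound).
# An upper bound of None means only a lower bound was reported.
# Units: percent, except cancer_detection_rate (per 1000 interpretations).
# The key names are hypothetical labels chosen for this sketch.
CUT_POINTS: Dict[str, Tuple[float, Optional[float]]] = {
    "sensitivity": (75.0, None),
    "specificity": (88.0, 95.0),
    "recall_rate": (5.0, 12.0),
    "ppv1": (3.0, 8.0),
    "ppv2": (20.0, 40.0),
    "cancer_detection_rate": (2.5, None),
}


def flag_measures(performance: Dict[str, float]) -> List[str]:
    """Return the measures that fall outside the reported cut points.

    Each measure is evaluated on its own. Measures missing from
    `performance` are skipped rather than flagged.
    """
    flagged = []
    for measure, (low, high) in CUT_POINTS.items():
        if measure not in performance:
            continue
        value = performance[measure]
        below = value < low
        above = high is not None and value > high
        if below or above:
            flagged.append(measure)
    return flagged


# Hypothetical example values, not data from the study.
example = {
    "sensitivity": 80.0,
    "specificity": 96.0,
    "recall_rate": 4.0,
    "ppv1": 5.0,
    "ppv2": 25.0,
    "cancer_detection_rate": 2.5,
}
print(flag_measures(example))
```

A non-empty result would only indicate that a physician's figures merit review; the abstract's conclusion is that such cases should be examined in the context of the specific practice setting before additional training is considered.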
Related resources
Mammography Facility Characteristics Associated With Interpretive Accuracy of Screening Mammography
BACKGROUND Although interpretive performance varies substantially among radiologists, such variation has not been examined among mammography facilities. Understanding sources of facility variation could become a foundation for improving interpretive performance. METHODS In this cross-sectional study conducted between 1996 and 2002, we surveyed 53 facilities to evaluate associations between fa...
Variability of interpretive accuracy among diagnostic mammography facilities.
BACKGROUND Interpretive performance of screening mammography varies substantially by facility, but performance of diagnostic interpretation has not been studied. METHODS Facilities performing diagnostic mammography within three registries of the Breast Cancer Surveillance Consortium were surveyed about their structure, organization, and interpretive processes. Performance measurements (false-...
Addressing the Challenge of Assessing Physician-Level Screening Performance: Mammography as an Example
BACKGROUND Motivated by the challenges in assessing physician-level cancer screening performance and the negative impact of misclassification, we propose a method (using mammography as an example) that enables confident assertion of adequate or inadequate performance or alternatively recognizes when more data is required. METHODS Using established metrics for mammography screening performance...
Radiologist characteristics associated with interpretive performance of diagnostic mammography.
BACKGROUND Extensive variability has been noted in the interpretive performance of screening mammography; however, less is known about variability in diagnostic mammography performance. METHODS We examined the performance of 123 radiologists who interpreted 35895 diagnostic mammography examinations that were obtained to evaluate a breast problem from January 1, 1996, through December 31, 2003...
Performance assessment for radiologists interpreting screening mammography.
When interpreting screening mammograms radiologists decide whether suspicious abnormalities exist that warrant the recall of the patient for further testing. Previous work has found significant differences in interpretation among radiologists; their false-positive and false-negative rates have been shown to vary widely. Performance assessments of individual radiologists have been mandated by th...
Journal: Radiology
Volume 255, Issue 2
Pages: -
Publication date: 2010